ABSTRACT

We have used the Rayner & Best (1989) smooth tests of goodness-of-fit to study
the Gaussianity of the Very Small Array (VSA) data. These tests are designed to be
sensitive to the presence of 'smooth' deviations from a given distribution, and are
applied to the data transformed into normalised signal-to-noise eigenmodes. In a
previous work, they were adapted and applied to simulated observations of
interferometric experiments. In this paper, we extend the practical implementation of
the method to deal with mosaiced observations, by introducing the Arnoldi algorithm.
This method permits us to solve large eigenvalue problems with low computational
cost.

Out of the 41 published VSA individual pointings dedicated to cosmological
(CMB) observations, 37 are found to be consistent with Gaussianity, whereas four
pointings show deviations from Gaussianity. In two of them, these deviations can be
explained as residual systematic effects of a few visibility points which, when corrected,
have a negligible impact on the angular power spectrum. The non-Gaussianity found
in the other two (adjacent) pointings seems to be associated with a local deviation of
the power spectrum of these fields with respect to the common power spectrum of the
complete data set, at angular scales of the third acoustic peak ($\ell = 700$ to $900$). No
evidence of residual systematics is found in this case, and unsubtracted point sources
are not a plausible explanation either. If those visibilities are removed, the differences
of the new power spectrum with respect to the published one only affect three bins.
A cosmological analysis based on this new VSA power spectrum alone shows no
differences in the parameter constraints with respect to our published results, except for
the physical baryon density, which decreases by 10 per cent.

Finally, the method has also been used to analyse the VSA observations in the
Corona Borealis supercluster region. Our method finds a clear deviation (99.82%) with
respect to Gaussianity in the second-order moment of the distribution, which cannot
be explained as systematic effects. A detailed study shows that the non-Gaussianity
is produced on scales of $\ell \approx 500$, and that this deviation is intrinsic to the data (in
the sense that it cannot be explained in terms of a Gaussian field with a different
power spectrum). This result is consistent with the Gaussianity studies in the Corona
Borealis data presented in Genova-Santos et al. (2005), which show a strong decrement

1 INTRODUCTION

The study of the Gaussianity of the primordial density fluctuations is a very
important tool in constraining theories of structure formation. Within the inflationary
paradigm, there is a huge number of theories (see Bartolo et al. (2004) for a recent
review on the subject), each one predicting different non-Gaussian signatures. Thus,
any detection of non-Gaussianity would help to discriminate among these scenarios
for the generation of cosmological perturbations. For this reason, the study of
the Gaussianity of Cosmic Microwave Background (CMB) maps is becoming of major
importance in modern cosmology. In particular, since the publication of the first
year WMAP results (Bennett et al. 2003), several groups have tested the non-Gaussian
nature of those maps using a wide set of techniques (Komatsu et al. 2003; Chiang et al.
2003; Eriksen et al. 2004a,b; Vielva et al. 2004; Park 2004; Cruz et al. 2005).

Furthermore, there are other reasons showing the importance of the study of the
Gaussianity of the CMB. The majority of the inflationary models predict the
primordial non-Gaussian signal to be smaller than the contribution from secondary
effects such as gravitational lensing, reionization, the Sunyaev-Zel'dovich effect, or
the contribution of local foregrounds or unresolved point sources in the maps. Thus,
tools to test Gaussianity could be used to trace the presence of these foregrounds.
For example, the analysis of the WMAP data using the bispectrum allowed
Komatsu et al. (2003) to estimate the source number counts of unresolved sources
in the 41 GHz channel (see also Gonzalez-Nuevo et al. (2005)).

In addition, systematic effects may produce spurious detections of
non-Gaussianities, so non-Gaussian methods could help in characterizing the
properties of a given experiment (e.g. Banday et al. 2000).

The Gaussianity of the VSA data was already examined using several methods in
two separate papers (Savage et al. 2004; Smith et al. 2004), which were based on the
data presented in Taylor et al. (2003) and Grainge et al. (2003). In Savage et al.
(2004), a selection of non-Gaussianity tests is applied to the data. Most of these
tests are based on real-space statistics and are applied to the maximum-entropy
reconstruction of the regions observed by the instrument. In Smith et al. (2004), the
analysis is devoted to the study of the bispectrum of the VSA data, showing how this
statistic can be obtained in the case of interferometric experiments.

In this paper, we present the results of a Gaussianity analysis of the complete
set of observations of the Very Small Array (VSA) dedicated to measuring the CMB
power spectrum (see Dickinson et al. (2004) and references therein), as well as an
analysis of the data from the Corona Borealis supercluster survey presented in
Genova-Santos et al. (2005). Here, we will complement the previous Gaussianity
studies of the VSA data by considering a different family of methods, called the
Smooth Tests of Goodness of Fit (STGOF).

In Section 2, we give a brief overview of the VSA experiment. In Section 3, we
review the Smooth Tests of Goodness of Fit methods, and how these methods can be
adapted to the study of the Gaussianity of interferometric experiments. Section 4
describes how these methods can be further adapted to deal with large datasets or
mosaiced observations. Section 5 presents the calibration of the method using
Gaussian simulations of mosaiced observations with the VSA. Section 6 presents the
results of our analysis, and finally conclusions are presented in Section 7.



2 THE VERY SMALL ARRAY 

The VSA is a 14-element heterodyne interferometer sited at the Teide Observatory
(Tenerife). The instrument is designed to image the CMB on scales going from 2° to
10', and operates at frequencies between 26 and 36 GHz with a 1.5 GHz bandwidth and
a system temperature of ~ 30 K. The VSA has observed in two configurations of
antennas. The first one is the so-called 'compact configuration', which covers the
multipole range $\ell \sim 150$ to $900$ with a primary beam of 4.6° FWHM at 34.1 GHz.
This configuration was used during the first observing season (September 2000 -
September 2001). The results of this campaign are presented in Watson et al. (2003);
Taylor et al. (2003); Scott et al. (2003) and Rubiño-Martín et al. (2003).

The second one, the 'extended configuration', provided observations up to
$\ell = 1500$ with a primary beam of 2.10° FWHM (at 33 GHz) and an angular resolution
of 11 arcmin during two separate campaigns. Those results were presented in two
separate sets of papers: Grainge et al. (2003); Slosar et al. (2003) for the second
season of observations (September 2001 - April 2002); and Dickinson et al. (2004);
Rebolo et al. (2004) for the third one (April 2002 - January 2003), where we obtained
maps both at 34.1 GHz and at 33 GHz. With this extended configuration, we have
obtained maps of a complete, X-ray flux-limited sample of seven clusters with
redshifts z < 0.1 (Lancaster et al. 2005). We have also produced imaging at 33 GHz of
the Corona Borealis supercluster (Genova-Santos et al. 2005), with the aim of
searching for Sunyaev-Zel'dovich detections from a possible extended signal due to
diffuse warm/hot gas. As shown in that paper, we found a strong decrement near the
centre of the supercluster, which cannot be associated either with primordial CMB
fluctuations or with an SZ effect from a known cluster of galaxies in the region.
Therefore, we shall consider these data in our Gaussianity analysis as well.

In Table 1 we summarise the whole set of observations obtained with the VSA and
used for cosmological studies, both with its compact and extended configurations.
The full dataset comprises 8 fields observed with the compact array, and 33 fields
observed with the extended one. This dataset can be arranged into seven separate
(non-overlapping) regions on the sky, each of them obtained by mosaicing individual
pointings. Each mosaiced field is labelled as "VSA" plus a number. Within each
mosaic, the names of the individual pointings are denoted by either no suffix, or the
suffixes A, B, -OFF, E, F, G, H, J, K and L. Detailed information about the fields
(central coordinates, integration times and maps) can be found in the indicated
references. All these regions were carefully chosen to minimise contamination from
Galactic emission and bright radio sources. Further details of the residual
contamination in the maps can be found in Dickinson et al. (2004), and details about
the VSA observational technique can be found in Watson et al. (2003).

Regarding the Corona Borealis observations, the core of the supercluster is
imaged with a 9-pointing mosaic, and we have two additional pointings outside this
region to map two supercluster members which lie far from the optical centre of
the supercluster. The total area covered is ~ 24 deg$^2$ with an angular resolution of
11 arcmin and a sensitivity of 12 mJy/beam. Detailed information about the fields
(central coordinates, integration times and maps) can be found in
Genova-Santos et al. (2005).



2.1 Interferometer measurements 

For observations of small patches of sky, we can adopt the flat-sky approximation
and use Fourier analysis instead of the spherical harmonic expansion for the
temperature field. In this limit, the complex visibility (which gives the response
of an interferometer observing at frequency $\nu$) can be written as

$$ V(\mathbf{u},\nu) = \int P(\mathbf{x},\nu)\, B(\mathbf{x},\nu)\, \exp(i 2\pi \mathbf{u}\cdot\mathbf{x})\, d\mathbf{x} \tag{1} $$

where $\mathbf{x}$ is the angular position of the observed point on the sky; $\mathbf{u}$ is the
baseline vector in units of the wavelength of the observed radiation (so $2\pi\mathbf{u}$ is the
Fourier mode); $P(\mathbf{x},\nu)$ is the primary beam of the antennas (normalised to unity
at its peak); and $B(\mathbf{x},\nu)$ is the brightness distribution on the sky. For the case of
CMB observations, this brightness can be expressed in terms of the equivalent
thermodynamic temperature fluctuations ($\Delta T(\mathbf{x})$) as

$$ B(\mathbf{x},\nu) = \left. \frac{\partial B_\nu(T)}{\partial T} \right|_{T=T_0} \Delta T(\mathbf{x}) \tag{2} $$

where $B_\nu(T)$ is the Planck function, and the mean temperature of the CMB is given
by $T_0 = 2.726$ K (Mather et al. 1994).

By inserting the Fourier decomposition of the sky brightness in equation (1) we
find that an interferometer measures the convolution of sky Fourier modes with the
aperture function (Fourier transform of the primary beam), sampling at those points
given by the projection of the baselines on the sky plane.
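As a rough numerical illustration of equation (1), the sketch below evaluates the visibility by direct summation over a pixelised flat-sky patch. The grid size, the patch width, the white-noise stand-in for the sky brightness and the function names are all assumptions made for this example; only the 4.6° FWHM Gaussian beam is taken from the text.

```python
import numpy as np

def gaussian_beam(x, y, fwhm_deg):
    """Primary beam P(x), normalised to unity at its peak (angles in radians)."""
    sigma = np.deg2rad(fwhm_deg) / np.sqrt(8.0 * np.log(2.0))
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2))

def visibility(u, v, x, y, brightness, beam, pixel_area):
    """Direct-sum approximation to V(u) = integral of P B exp(i 2 pi u.x) dx."""
    phase = np.exp(2j * np.pi * (u * x + v * y))
    return np.sum(beam * brightness * phase) * pixel_area

# Hypothetical 10 deg x 10 deg patch sampled on a regular grid.
half_width = np.deg2rad(5.0)
axis = np.linspace(-half_width, half_width, 129)
x, y = np.meshgrid(axis, axis)
pixel_area = (axis[1] - axis[0]) ** 2

rng = np.random.default_rng(0)
sky = rng.standard_normal(x.shape)        # stand-in for B(x), arbitrary units
beam = gaussian_beam(x, y, fwhm_deg=4.6)  # compact-array beam width

# Baseline (u, v) in units of the observing wavelength, and its mirror image.
v_plus = visibility(40.0, 25.0, x, y, sky, beam, pixel_area)
v_minus = visibility(-40.0, -25.0, x, y, sky, beam, pixel_area)
```

Because the sky brightness is real, the two values are complex conjugates of each other, which is why only half of the uv-plane carries independent information.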

We note that the previous equation does not take into account the contribution
of instrumental noise. Thus, for a realistic instrument observing at a frequency $\nu$,
the $i$th baseline $\mathbf{u}_i$ of the interferometer will measure the following quantity

$$ d(\mathbf{u}_i,\nu) = V(\mathbf{u}_i,\nu) + n(\mathbf{u}_i,\nu) $$

where $n(\mathbf{u}_i,\nu)$ stands for the instrumental noise on the $\mathbf{u}_i$ visibility.

Let $N$ be the total number of complex visibilities observed by an interferometer.
Then, the complete set of observed visibilities will be denoted by the following
vector with $N_d = 2N$ elements

$$ \mathbf{d} = \{ \Re[d(\mathbf{u}_1,\nu_1)], \ldots, \Re[d(\mathbf{u}_N,\nu_N)], \Im[d(\mathbf{u}_1,\nu_1)], \ldots, \Im[d(\mathbf{u}_N,\nu_N)] \} $$

where the label $\Re$ ($\Im$) stands for the real (imaginary) part of the complex number.
We must note that in this equation, we explicitly differentiate the observing
frequency for each observed sample because, in general, we could combine data
taken at different frequencies with the same instrument (this is indeed the case of
the VSA, for which we have observations at two different frequencies, 33 GHz and
34.1 GHz).

Table 1. Summary of the VSA observations dedicated to cosmological studies, and which have been analysed in this paper. We present
the names for individual pointings contributing to one of the 7 VSA mosaics, separating the observations according to the three VSA
campaigns. The central coordinates, integration times and maps for each of the individual pointings can be found in the three specified
references.

Mosaic   Compact      Extended                Extended II
                      (Grainge et al. 2003)

VSA1     1, 1A, 1B    1E, 1F, 1G              1H, 1J, 1K, 1L
VSA2     2, 2-OFF     2E, 2F, 2G              2H, 2J, 2K, 2L
VSA3     3, 3A, 3B    3E, 3F, 3G              3H, 3J, 3K, 3L
VSA5     -            -                       5E, 5F, 5G
VSA6     -            -                       6E, 6F, 6G
VSA7     -            -                       7E, 7F, 7G
VSA8     -            -                       8E, 8F, 8G


From here, we can also define the vector for the sky signal, and the vector for
the noise in the same way, so we have $\mathbf{d} = \mathbf{V} + \mathbf{n}$. Assuming that there are no
correlations between the sky signal and the noise, the covariance matrix for this set
of observations can be written as

$$ \mathbf{C} = \langle \mathbf{d}\mathbf{d}^t \rangle = \mathbf{S} + \mathbf{N} $$

where $\mathbf{S}$ and $\mathbf{N}$ are the covariance matrices of the contributions from the sky
signal and the noise, respectively. In our analysis, we shall take this noise
covariance matrix as diagonal, as is the case of the VSA data (e.g. Dickinson et al.
2004).
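The additivity of the two covariance matrices can be checked with a small Monte Carlo experiment. The sketch below is only a toy: the matrix sizes, the randomly generated signal covariance and the noise variances are assumptions for illustration and have nothing to do with the real VSA matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vis = 4                 # number of complex visibilities (toy value)
n_d = 2 * n_vis           # length of the real data vector, N_d = 2N

# Toy signal covariance S (symmetric, positive definite) and diagonal noise N.
m = rng.standard_normal((n_d, n_d))
S = m @ m.T / n_d
noise_var = rng.uniform(0.5, 1.5, n_d)
N = np.diag(noise_var)

# Draw independent signal and noise realisations and add them: d = V + n.
n_sims = 200_000
V = rng.multivariate_normal(np.zeros(n_d), S, size=n_sims)
n = rng.standard_normal((n_sims, n_d)) * np.sqrt(noise_var)
d = V + n

# Sample estimate of <d d^t>, compared with S + N.
C_est = d.T @ d / n_sims
max_err = np.max(np.abs(C_est - (S + N)))
```

The largest element-wise difference shrinks as the number of simulations grows, as expected when signal and noise are uncorrelated.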

The covariance matrix for the CMB component ($\mathbf{S}$) can be computed analytically
using the equations presented in Hobson & Maisinger (2002), both for the case of
single and mosaiced observations. If the primary beam of the interferometer horns is
symmetric with respect to inversion through the origin (as is the case for the VSA
experiment, where the primary beam can be modelled to a good approximation by a
spherical Gaussian function), then the aperture function is real. As a consequence,
for the case of single-field observations the covariance matrix is block-diagonal
(i.e. the real and imaginary parts of the visibilities are uncorrelated). Note that
for mosaiced observations, this is not true in general.



3 GOODNESS-OF-FIT STATISTICS APPLIED 
TO INTERFEROMETERS 

In this section we summarise some aspects of the smooth goodness-of-fit tests
applied to CMB interferometers. For a more detailed description, see Aliaga et al.
(2005, hereafter A05).

As shown in the previous section, the visibilities observed by an interferometer
are correlated quantities. Therefore, the STGOF have to be adapted to deal with these
data, because in their original form, these tests require independent data points.
As described in A05, this is done following a two-step procedure. First, the data $\mathbf{d}$
are transformed into signal-to-noise eigenmodes (Bond 1995), and they are
normalised. After this, the smooth goodness-of-fit tests developed by Rayner & Best
(1989) can be applied to the normalised eigenmodes, which for the Gaussian case
would be independent.



3.1 Signal-to-noise eigenmodes 

In a first step, the data are transformed into signal-to-noise eigenmodes as
explained in A05. Every eigenmode has an associated signal-to-noise eigenvalue, in
such a way that the higher the eigenvalue, the higher the signal-to-noise ratio of
the eigenmode. Thus, this decomposition permits us not only to decorrelate the
visibilities, but also to select those data points in which the signal contribution
dominates over that of the noise.

Let $\mathbf{L}_n$ be the square root matrix of the noise correlation matrix (i.e.
$\mathbf{N} = \mathbf{L}_n \mathbf{L}_n^t$), and $\mathbf{R}$ the rotation matrix which diagonalizes the matrix

$$ \mathbf{A} = \mathbf{L}_n^{-1} \mathbf{S} \mathbf{L}_n^{-t} \tag{3} $$

Thus, $\mathbf{R}^t \mathbf{A} \mathbf{R} = \mathbf{E}$, where $\mathbf{E} = \mathrm{diag}(E_1, \ldots, E_{N_d})$ is a diagonal matrix whose
diagonal elements are the signal-to-noise eigenvalues ($E_i$). With these definitions,
the signal-to-noise eigenmodes are obtained as

$$ \boldsymbol{\xi} = \mathbf{R}^t \mathbf{L}_n^{-1} \mathbf{d} \tag{4} $$

From here, it is easy to show that the covariance matrix associated with these
variables is given by $\langle \boldsymbol{\xi}\boldsymbol{\xi}^t \rangle = \mathbf{E} + \mathbf{I}_{N_d}$, where $\mathbf{I}_{N_d}$ is the identity matrix with
dimension $N_d \times N_d$.

The normalised signal-to-noise eigenmodes can be defined from here as
$y_i = \xi_i / (E_i + 1)^{1/2}$ ($i = 1, \ldots, N_d$), and one can immediately show that these
quantities are uncorrelated and satisfy $\langle y_i y_j \rangle = \delta_{ij}$.

For our case of interest (CMB analyses), the important point is that equation (4)
is a linear transformation, so it preserves the Gaussianity of the variables (i.e. if
the data $\mathbf{d}$ are distributed following a multi-normal function, then the normalised
eigenmodes will follow a one-dimensional Gaussian distribution $N(0, 1)$). Thus, one
can now apply the STGOF to these transformed variables.
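The transformation of this subsection can be sketched in a few lines of NumPy. This is a minimal illustration for a diagonal noise matrix, not the code used for the analysis; the function name `sn_eigenmodes`, the toy matrices and their sizes are assumptions made for the example.

```python
import numpy as np

def sn_eigenmodes(d, S, noise_var):
    """Return (E, y, R): S/N eigenvalues, normalised eigenmodes and rotation.

    S is the signal covariance and noise_var holds the diagonal of N.
    """
    L_inv = 1.0 / np.sqrt(noise_var)      # L_n^{-1} for a diagonal N
    A = S * np.outer(L_inv, L_inv)        # A = L_n^{-1} S L_n^{-t}
    E, R = np.linalg.eigh(A)              # R^t A R = diag(E)
    xi = R.T @ (L_inv * d)                # xi = R^t L_n^{-1} d
    y = xi / np.sqrt(E + 1.0)             # y_i = xi_i / (E_i + 1)^{1/2}
    return E, y, R

rng = np.random.default_rng(2)
n_d = 6
m = rng.standard_normal((n_d, n_d))
S = m @ m.T                               # toy signal covariance
noise_var = rng.uniform(0.5, 2.0, n_d)    # toy noise variances
C = S + np.diag(noise_var)
d = rng.multivariate_normal(np.zeros(n_d), C)

E, y, R = sn_eigenmodes(d, S, noise_var)

# The whole chain is one linear map y = M d; its output covariance is M C M^t.
M = (R.T * (1.0 / np.sqrt(noise_var))) / np.sqrt(E + 1.0)[:, None]
cov_y = M @ C @ M.T
```

Here `cov_y` equals the identity matrix up to rounding, which is the statement $\langle y_i y_j \rangle = \delta_{ij}$; since the map is linear, Gaussian data give Gaussian $y_i$.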



3.2 Smooth tests of goodness-of-fit 

Let us assume that we have $n$ independent realizations $\{x_j\}_{j=1}^{n}$ of a statistical
variable $x$, and we want to test if $x$ has a distribution function compatible with
$f(x)$ (null hypothesis). Rayner & Best (1989) proposed some statistics to
discriminate between $f$ and another distribution (alternative hypothesis) which
deviates smoothly from $f$. In the case in which $f$ is a Gaussian ($N(0, 1)$), it can be
shown that the first four score statistics associated with the alternative are
given by

$$ S_k = \sum_{i=1}^{k} U_i^2 \tag{5} $$

with

$$ U_1^2 = n (\hat{\mu}_1)^2 $$
$$ U_2^2 = n (\hat{\mu}_2 - 1)^2 / 2 $$
$$ U_3^2 = n (\hat{\mu}_3 - 3\hat{\mu}_1)^2 / 6 $$
$$ U_4^2 = n [(\hat{\mu}_4 - 3) - 6(\hat{\mu}_2 - 1)]^2 / 24 \tag{6} $$

where $\hat{\mu}_\alpha = (\sum_{j=1}^{n} x_j^\alpha)/n$ is the estimated moment of order $\alpha$. It should be
noted that this test is directional, i.e. it indicates how the actual distribution
deviates from Gaussianity.

If the data are drawn from the distribution given by $f$, and $n$ is large enough
($n > 100$), then it is possible to show that the $U_i^2$ quantities are distributed
following a $\chi^2$ with one degree of freedom. If the data do not follow an $f$
distribution, we expect departures from this distribution, and this is the way we
detect non-Gaussian signals.

We shall apply these statistics to the normalised eigenmodes described in the
last subsection. The important point for us is that, as shown in A05, we can select
subsets of signal-to-noise eigenmodes, according to their associated eigenvalue $E_i$.
This will permit us to test if $y_i \sim N(0, 1)$ (that is, our null hypothesis is that $f$
is a Gaussian function).
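The four statistics are simple functions of the first four sample moments, as the following sketch shows. It assumes that the score statistic of order $k$ is the cumulative sum of the first $k$ values of $U_i^2$; the function names are chosen for this example only, and the p-value helper simply uses the asymptotic $\chi^2_1$ law quoted above.

```python
from math import erfc, sqrt

import numpy as np

def smooth_test_statistics(y):
    """Return (U_i^2 for i = 1..4, cumulative sums S_k) for a sample y."""
    y = np.asarray(y, dtype=float)
    n = y.size
    mu1, mu2, mu3, mu4 = (np.mean(y**a) for a in (1, 2, 3, 4))
    u2 = np.array([
        n * mu1**2,
        n * (mu2 - 1.0)**2 / 2.0,
        n * (mu3 - 3.0 * mu1)**2 / 6.0,
        n * ((mu4 - 3.0) - 6.0 * (mu2 - 1.0))**2 / 24.0,
    ])
    return u2, np.cumsum(u2)

def chi2_1_sf(u2):
    """Asymptotic probability P(chi^2_1 > u2) for a single U_i^2 value."""
    return erfc(sqrt(u2 / 2.0))

# Toy sample: three values at -1 and three at +1 (clearly not Gaussian).
u2, s = smooth_test_statistics([-1.0, 1.0] * 3)
```

For this toy sample the first three statistics vanish (mean, variance and skewness match the Gaussian values), while the fourth one equals 1 because the fourth moment is too small; this is the sense in which the test is directional.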



4 EXTENDING THE METHOD TO MOSAICED FIELDS. THE ARNOLDI
ALGORITHM

The method we have applied in the previous sections only uses a few eigenmodes
(those whose eigenvalue is large enough), but the correlation matrices are
relatively large (from 5000 x 5000 to 10000 x 10000 for the mosaic analysis). This
implies a high computational cost in the diagonalization of the matrices. Thus, we
pose the following question: is it possible to reduce the dimension of the
correlation matrix so as to calculate only those eigenmodes and eigenvalues we need?

There are several numerical methods for solving eigenvalue problems with large
matrices. We shall consider here one particular class of methods, named the Krylov
subspace methods, and in particular the Arnoldi algorithm. This method can be
applied to general non-Hermitian matrices, although we will focus here on the
application to our problem, in which the covariance matrix is real and symmetric,
and has a sparse structure. Our development is based on Saad (1992) (see also
Aliaga 2005).



4.1 The Arnoldi method 

This procedure was introduced as a means of reducing a dense matrix into
Hessenberg form (an almost triangular matrix). However, the important point for us
is that this method was shown to be a good technique for approximating the
eigenvalues of large sparse matrices.

The basis of the method is as follows. We start from our sparse matrix $\mathbf{A}$ given
in equation (3), which has dimensions $N_d \times N_d$. Then, we want to build a new matrix
$\mathbf{H}$ with dimensions $m \times m$, in such a way that $m < N_d$ and the
eigenvalues/eigenvectors of $\mathbf{H}$ are approximations to the eigenvalues/eigenvectors
of $\mathbf{A}$ (obviously we will not recover all the eigenvalues, but only $m$ at the most).

The algorithm produces a set of vectors, $\{\mathbf{q}_1, \ldots, \mathbf{q}_m\}$, which form an
orthonormal basis of the subspace spanned by $\{\mathbf{q}_1, \mathbf{A}\mathbf{q}_1, \ldots, \mathbf{A}^{m-1}\mathbf{q}_1\}$. One variant
of this algorithm for real and symmetric matrices will be presented below. From
these vectors, we can build the following $(N_d \times m)$ matrix

$$ Q_{ij} = (\mathbf{q}_j)_i, \qquad i = 1, \ldots, N_d; \quad j = 1, \ldots, m $$

with $(\mathbf{q}_j)_i$ the $i$th component of the $j$th vector. The matrix in which we are
interested can be derived as

$$ \mathbf{H} = \mathbf{Q}^\dagger \mathbf{A} \mathbf{Q} \tag{7} $$

where $\dagger$ stands for the conjugate transpose^1. The $\mathbf{H}$ matrix has an upper Hessenberg
(almost upper triangular) form, and can be diagonalized to find its eigenvalues
$E_i^{(H)}$ and associated eigenvectors $\mathbf{y}_i^{(H)}$. From here, we can build the Ritz
approximate eigenvectors associated with $E_i^{(H)}$ as $\mathbf{e}_i^{(H)} = \mathbf{Q}\mathbf{y}_i^{(H)}$. The important
point for us is that a fraction of these Ritz eigenvectors are a good approximation
of the corresponding eigenvectors of $\mathbf{A}$, and at the same time, the $E_i^{(H)}$ give a
good approximation to the associated eigenvalues $E_i$. Moreover, the quality of the
approximation improves as $m$ increases.

We note that there are simple analytical expressions for the residual norm
associated with the Ritz eigenvectors. These expressions can be easily implemented
in the algorithm, so one controls the quality of the approximations (Saad 1992;
Aliaga 2005).
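For reference, the standard textbook form of this residual for an $m$-step Arnoldi factorisation is written out below. The symbol $h_{m+1,m}$ is not defined in the text above: it is assumed here to denote the norm of the vector left over after the last step of the iteration, and $(\mathbf{y}_i^{(H)})_m$ is the last component of the unit-norm eigenvector of $\mathbf{H}$.

```latex
% Arnoldi relation after m steps (q_{m+1} is the next basis vector,
% e_m the m-th unit vector):
%   A Q = Q H + h_{m+1,m} q_{m+1} e_m^t
% Applying it to an eigenvector y_i^{(H)} of H gives the residual norm:
\left\| \left( \mathbf{A} - E_i^{(H)} \mathbf{I} \right) \mathbf{e}_i^{(H)} \right\|
  = h_{m+1,m} \, \left| \left( \mathbf{y}_i^{(H)} \right)_m \right|
```

The residual can therefore be monitored at negligible cost, without forming the product of $\mathbf{A}$ with each Ritz eigenvector.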



4.2 Signal-to-noise eigenmodes 

We now show how to use signal-to-noise eigenmodes together with the Arnoldi
method.

Let $\mathbf{R}^{(H)}$ be the rotation matrix which diagonalizes $\mathbf{H}$. $\mathbf{R}^{(H)}$ is constructed in
such a way that its $i$th column corresponds to the $\mathbf{y}_i^{(H)}$ eigenvector defined in the
previous subsection (i.e. $R^{(H)}_{ji} = (\mathbf{y}_i^{(H)})_j$). Then,

$$ (\mathbf{R}^{(H)})^\dagger \mathbf{H} \mathbf{R}^{(H)} = \mathrm{diag}(E_1^{(H)}, \ldots, E_m^{(H)}) = \mathbf{E}^{(H)} \tag{8} $$

We now define the matrix $\mathbf{T} = \mathbf{Q}\mathbf{R}^{(H)}$, which has dimensions $N_d \times m$. Hence, using
equations (7) and (8) we have

$$ \mathbf{T}^\dagger \mathbf{A} \mathbf{T} = \mathbf{E}^{(H)} $$

Using these matrices, the signal-to-noise eigenmodes can be defined as

$$ \boldsymbol{\xi}^{(H)} = \mathbf{T}^\dagger \mathbf{L}_n^{-1} \mathbf{d} \tag{9} $$

and the corresponding correlation matrix can be written as
$\langle \boldsymbol{\xi}^{(H)} (\boldsymbol{\xi}^{(H)})^\dagger \rangle = \mathbf{E}^{(H)} + \mathbf{I}_m$. The transformed signal and noise vectors are
given here by $\tilde{\mathbf{V}} = \mathbf{T}^\dagger \mathbf{L}_n^{-1} \mathbf{V}$ and $\tilde{\mathbf{n}} = \mathbf{T}^\dagger \mathbf{L}_n^{-1} \mathbf{n}$, such that
$\langle \tilde{\mathbf{V}} \tilde{\mathbf{V}}^\dagger \rangle = \mathbf{E}^{(H)}$ and $\langle \tilde{\mathbf{n}} \tilde{\mathbf{n}}^\dagger \rangle = \mathbf{I}_m$. Therefore, the meaning of the
signal-to-noise eigenmodes is preserved, and eigenmodes associated with large
eigenvalues have more contribution from the signal than from the noise.

^1 In our case, A is real, so we could use the transpose instead of the
conjugate transpose.

As indicated in the previous subsection, some of the eigenvalues $E_i^{(H)}$ will
also be eigenvalues of $\mathbf{A}$ to a good approximation. Moreover, as we shall see below,
those eigenvalues which are better approximated will correspond to high values of
the signal-to-noise eigenvalue. Thus, this method will permit us to estimate the
eigenvalues (and the associated signal-to-noise eigenmodes) which are of interest
for our analysis, while the dimensionality of the problem is greatly reduced.
Finally, let us note that with the previous definition of signal-to-noise
eigenmodes associated with $\mathbf{H}$, they will directly give a good approximation to
the corresponding eigenmodes for $\mathbf{A}$.
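A small numerical experiment can illustrate how the reduced problem recovers the high signal-to-noise eigenmodes. The sketch below is a toy under stated assumptions: the matrix size, the eigenvalue spectrum, the helper `krylov_basis` (a plain Gram-Schmidt construction of the Krylov basis) and the whitened data vector are all invented for this example.

```python
import numpy as np

def krylov_basis(A, q1, m):
    """Orthonormal basis of span{q1, A q1, ..., A^(m-1) q1} (Gram-Schmidt)."""
    Q = np.zeros((q1.size, m))
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(1, m):
        w = A @ Q[:, j - 1]
        for _ in range(2):                    # orthogonalise twice for stability
            w = w - Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    return Q

rng = np.random.default_rng(3)
n_d, m = 200, 30

# Toy whitened signal matrix A = L_n^{-1} S L_n^{-t} with a known spectrum.
U, _ = np.linalg.qr(rng.standard_normal((n_d, n_d)))
E_true = 50.0 / (1.0 + np.arange(n_d)) ** 2   # eigenvalues, largest first
A = (U * E_true) @ U.T

# Whitened data L_n^{-1} d, drawn with covariance A + I.
d_white = U @ (np.sqrt(E_true + 1.0) * rng.standard_normal(n_d))

Q = krylov_basis(A, rng.standard_normal(n_d), m)
H = Q.T @ A @ Q                               # reduced m x m matrix
E_H, R_H = np.linalg.eigh(H)                  # eigenvalues in ascending order
T = Q @ R_H                                   # N_d x m matrix T = Q R^(H)
xi_H = T.T @ d_white                          # approximate S/N eigenmodes
y_H = xi_H / np.sqrt(E_H + 1.0)               # normalised eigenmodes

# Relative error of the five largest Ritz values with respect to the truth.
rel_err = np.abs(E_H[::-1][:5] - E_true[:5]) / E_true[:5]
```

In this toy case the largest eigenvalues are reproduced to high accuracy with $m = 30$, although the full matrix has dimension 200, and only the small matrix `H` is ever diagonalized.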



4.3 Relationship between the matrices which
diagonalize H and A

From the definition of the signal-to-noise eigenmodes given in equation (9), it
is clear that if a pair $E_i^{(H)}, \mathbf{e}_i^{(H)}$ (an eigenvalue and its associated Ritz
eigenvector) is a good approximation to the corresponding pair for the $\mathbf{A}$ matrix,
then $\xi_i^{(H)}$ will give a good approximation to $\xi_i$ as well. However, there is a sign
ambiguity when implementing this algorithm in practice. When obtaining the
eigenvectors of a given matrix, we impose that each of them should have unit norm,
but an ambiguity in the sign is still present (if $\mathbf{y}$ is an eigenvector, then $-\mathbf{y}$
is an eigenvector as well).

This ambiguity can be avoided by imposing an additional condition, for example,
that the first non-zero component of each eigenvector should be positive (in fact,
due to precision problems, it is better to impose that the component which is the
maximum in absolute value for each eigenvector should be positive). With this
criterion, it is clear that the statistics of the smooth goodness-of-fit tests of
Rayner & Best (1989) can be computed from the signal-to-noise eigenmodes of $\mathbf{H}$,
yielding exactly the same values as those computed from $\mathbf{A}$ if they only involve
those eigenvectors whose associated eigenvalues are correctly approximated. In the
next two subsections we present an example of implementation of this method, and we
apply it to the case of simulated VSA observations.



4.4 Lanczos algorithm

This is a particular simplification of the Arnoldi algorithm for the case when
the considered matrix is Hermitian. In this case, it can be shown that the $\mathbf{H}$ matrix
is real, tridiagonal and symmetric. Thus, the algorithm becomes computationally
faster, and fewer variables need to be stored in memory. The implementation of the
algorithm that we have used here is the following:

(i) Start. We choose an initial unit vector $\mathbf{q}_1$, and we define $H_{01} = 0$, $\mathbf{q}_0 = 0$.

(ii) Iterate. For $j = 1, \ldots, m$:

$$ \mathbf{w} = \mathbf{A}\mathbf{q}_j - H_{j-1,j}\,\mathbf{q}_{j-1} \tag{10} $$
$$ H_{jj} = \mathbf{q}_j^\dagger \mathbf{w} \tag{11} $$
$$ \mathbf{w} = \mathbf{w} - H_{jj}\,\mathbf{q}_j \tag{12} $$
$$ H_{j,j+1} = \|\mathbf{w}\| \tag{13} $$
$$ \mathbf{q}_{j+1} = \mathbf{w} / H_{j,j+1} \tag{14} $$

In the previous algorithm, the symbol $\|\cdot\|$ represents the Euclidean norm of a
vector, and it is defined as $\|\mathbf{x}\| = (\mathbf{x}^\dagger \mathbf{x})^{1/2}$, where the symbol $\dagger$ stands for the
conjugate transpose operator.

In exact arithmetic, the above algorithm guarantees that the $\mathbf{q}_j$ vectors are
orthonormal. However, when $m$ is relatively large, the orthonormality of the $\mathbf{q}_j$
vectors is lost in practice, so they have to be re-orthonormalised as they are
calculated (for example using the Gram-Schmidt method). We note that if $\mathbf{A}$ is
sparse (as is our case), then there are algorithms to optimise the product
operation $\mathbf{A}\mathbf{q}_j$, and therefore the iterative process can be accelerated (see Saad
1992).
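The Lanczos recurrence of this subsection can be sketched as follows for a real symmetric matrix. This is a minimal illustration based on the standard form of the algorithm, with an explicit Gram-Schmidt re-orthogonalisation at every step; the function name, the dense toy matrix and its spectrum are assumptions for this example, and no sparse-matrix optimisation is attempted.

```python
import numpy as np

def lanczos(A, q1, m):
    """m-step Lanczos tridiagonalisation of a real symmetric matrix A."""
    n = q1.size
    Q = np.zeros((n, m))
    H = np.zeros((m, m))
    q_prev = np.zeros(n)                      # q_0 = 0
    q = q1 / np.linalg.norm(q1)               # unit starting vector q_1
    beta = 0.0                                # H_{0,1} = 0
    for j in range(m):
        Q[:, j] = q
        w = A @ q - beta * q_prev             # w = A q_j - H_{j-1,j} q_{j-1}
        alpha = q @ w                         # diagonal element H_{j,j}
        w = w - alpha * q                     # w = w - H_{j,j} q_j
        for _ in range(2):                    # explicit re-orthogonalisation
            w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        H[j, j] = alpha
        if j + 1 < m:
            beta = np.linalg.norm(w)          # H_{j,j+1} = ||w||
            H[j, j + 1] = H[j + 1, j] = beta
            q_prev, q = q, w / beta           # q_{j+1} = w / H_{j,j+1}
    return Q, H

rng = np.random.default_rng(6)
n_d, m = 120, 25
U, _ = np.linalg.qr(rng.standard_normal((n_d, n_d)))
E_true = 50.0 / (1.0 + np.arange(n_d)) ** 2   # toy spectrum, largest first
A = (U * E_true) @ U.T

Q, H = lanczos(A, rng.standard_normal(n_d), m)
E_H = np.linalg.eigvalsh(H)                   # Ritz values, ascending order
```

Only the diagonal and one off-diagonal of `H` are non-zero, so in a production code two vectors of length $m$ would suffice to store it; the full array is kept here for readability.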



5 CALIBRATION OF THE METHOD WITH
SIMULATED VSA MOSAICED
OBSERVATIONS

The STGOF have already been applied to simulated observations of the VSA in A05,
showing the ability of the method to detect non-Gaussian signals introduced via the
Edgeworth expansion, cosmic strings or $\chi^2$-simulations, when we use realistic noise
levels from the experiment. In that paper, the analysis was performed by simulating
a single VSA pointing, and setting the noise levels according to the typical
integration time of the fields. In order to complete this picture, we present here
the calibration of the method by using simulated Gaussian mosaiced observations.
There is no difference in practice between analysing single and mosaiced fields, so
the results will be similar to those found in A05. However, this study will permit
us to test the software for computing covariance matrices in mosaiced observations.
This software evaluates the covariance matrix from a given power spectrum, using
the equations presented in Hobson & Maisinger (2002).

A visibility file of an individual pointing typically con- 
tains 10^ — lO'' visibility points. Thus, we need to bin the 
data into cells of a certain size prior to the Gaussianity analysis. For this paper,
we have adopted the same bin size and binning procedure used for the determination
of the power spectrum (see Scott et al. 2003 and Grainge et al. 2003). Thus, the
compact array data were binned using a cell size of $4\lambda$, where $\lambda$ is the observing
wavelength, whereas for the extended array we used $9\lambda$. Tests with different cell
sizes were done in A05, showing that there are no significant changes in the
results when varying these values.
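The idea of binning visibilities into uv-plane cells can be sketched as below. The details are assumptions made for this example and are not taken from the papers cited above: cells are indexed by flooring the baseline coordinates, and the visibilities in each cell are combined with inverse-variance weights. The function name and the toy numbers are hypothetical.

```python
import numpy as np

def bin_visibilities(u, v, vis, sigma, cell):
    """Inverse-variance average of the visibilities falling in each uv cell.

    u, v and cell are in units of the observing wavelength.
    """
    iu = np.floor(np.asarray(u) / cell).astype(int)
    iv = np.floor(np.asarray(v) / cell).astype(int)
    weights = 1.0 / np.asarray(sigma) ** 2
    cells = {}
    for key, w, z in zip(zip(iu.tolist(), iv.tolist()), weights, vis):
        sum_w, sum_wz = cells.get(key, (0.0, 0.0))
        cells[key] = (sum_w + w, sum_wz + w * z)
    keys = sorted(cells)
    centres = (np.array(keys) + 0.5) * cell             # cell centres
    binned = np.array([cells[k][1] / cells[k][0] for k in keys])
    noise = np.array([1.0 / np.sqrt(cells[k][0]) for k in keys])
    return centres, binned, noise

# Three toy visibilities; the first two fall in the same 9-wavelength cell.
centres, binned, noise = bin_visibilities(
    u=[1.0, 2.0, 10.0],
    v=[1.0, 3.0, 1.0],
    vis=np.array([1.0 + 1.0j, 3.0 + 3.0j, 5.0 + 0.0j]),
    sigma=[1.0, 1.0, 2.0],
    cell=9.0,
)
```

With these weights the noise on a binned point is the usual combination of the individual noise values, so the noise covariance matrix of the binned data remains diagonal.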



5.1 Gaussian simulations 

To illustrate the method in the case of mosaiced observations, we used the VSA1
mosaic with the extended configuration (Grainge et al. 2003). This mosaic contains
three individual pointings (VSA1E, VSA1F and VSA1G). After binning using $9\lambda$ cells,
the data files contain 914, 882 and 911 complex visibility points, respectively.
Thus, the covariance matrix has in this case a size of $N_d = 5414$ ($N = 2707$).
We performed 10000 Gaussian simulations of this three-pointing mosaic, including
Gaussian CMB signal plus Gaussian noise, according to the noise levels of the real
observations. The power spectrum adopted for the simulations, as well as for their
analysis, corresponds to the one presented in Dickinson et al. (2004).

Table 2. Values of the mean ($\langle U_i^2 \rangle$) and the standard deviation
($\sigma$) of the statistics $U_i^2$ for 10000 Gaussian CMB plus noise
simulations of the VSA1 extended mosaic. They are compared with
the corresponding asymptotic values (for $\chi^2_1$), displayed in the last
column. We show the results for two different values of $E_{cut}$. The
numbers within parentheses indicate the number of eigenvalues
with $E_i \geq E_{cut}$.

Fig. 1 shows the histogram of the obtained $U_i^2$ values, and compares it with the
expected $\chi^2_1$ distribution. As expected, the $U_i^2$ quantities are distributed
following a $\chi^2_1$. Table 2 presents the mean value and the standard deviation of
these $U_i^2$ quantities. As we will see below, we will focus our analysis on those
values for the statistics which are computed using a subset of the signal-to-noise
eigenmodes with high signal-to-noise ratios, i.e. using only those eigenmodes with
associated eigenvalues satisfying $E_i > E_{cut}$. In particular, we will use the value
of $E_{cut} = 0.4$, so we present here the results both for $E_{cut} = 0$ and $E_{cut} = 0.4$. In
this second case, we keep only ~ 4% of the data (219 points), which is why the
distribution of $U_i^2$ is slightly broader than the asymptotic value of $\sqrt{2}$.
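The behaviour of the statistics under the null hypothesis can be reproduced with a toy Monte Carlo experiment. The sketch below draws independent unit-variance Gaussian numbers directly, instead of simulating visibilities and transforming them into eigenmodes, so it is only an idealised stand-in for the calibration described here; the function name and the random seed are choices made for this example.

```python
import numpy as np

def u_squared(y):
    """First four smooth-test statistics U_i^2 for each row of y."""
    n = y.shape[-1]
    mu = [np.mean(y**a, axis=-1) for a in (1, 2, 3, 4)]
    return np.stack([
        n * mu[0]**2,
        n * (mu[1] - 1.0)**2 / 2.0,
        n * (mu[2] - 3.0 * mu[0])**2 / 6.0,
        n * ((mu[3] - 3.0) - 6.0 * (mu[1] - 1.0))**2 / 24.0,
    ], axis=-1)

rng = np.random.default_rng(4)
n_modes, n_sims = 219, 10_000      # 219 modes, as kept for E_cut = 0.4
y = rng.standard_normal((n_sims, n_modes))

u2 = u_squared(y)                  # shape (n_sims, 4)
mean_u2 = u2.mean(axis=0)          # chi^2_1 expectation: 1
std_u2 = u2.std(axis=0)            # chi^2_1 expectation: sqrt(2)
```

The sample means stay close to 1 for all four statistics, while the scatter of the higher-order ones can exceed $\sqrt{2}$ when only a few hundred modes are used, because the $\chi^2_1$ law holds only asymptotically.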

In order to illustrate the sensitivity of the method to the power spectrum used
in the computation of the covariance matrix, we have done the following test. Using
the measured power spectrum from Dickinson et al. (2004), we have created three mock
power spectra, two of them defined by the envelope of the 1-sigma error bars of the
data, and the third one as an intermediate case connecting alternate values of +1
and -1 sigma, as shown in Figure 2. We call the "upper power spectrum" ("lower power
spectrum") the one derived by linking the measured data points plus (minus) one
sigma, while the "oscillating power spectrum" is the one derived by linking the
alternate measured data points plus/minus one sigma.

Figure 2. Power spectrum used in the simulations. The data
points correspond to the data presented in Dickinson et al. (2004),
but no correction due to residual sources and Galactic foregrounds
has been applied. The solid line is a spline interpolation of the
data points, whereas the dashed lines are obtained by interpolating
the regions of +1 sigma and -1 sigma. The dotted line corresponds
to an intermediate case, where we connect alternate values
of plus and minus one sigma.

Table 3. Same as Table 2, but now the 10000 simulations are
analysed using the lower fit power spectrum explained in the text.

We use these power spectra to analyse the previous
10000 simulations (which were generated using the measured
power spectrum). The mean values and the standard deviations
of the $U_i^2$ quantities are shown in Tables 3, 4 and 5 (lower,
upper and oscillating power spectrum, respectively). These test
cases show how the distribution of the $U_i^2$ quantities changes
when an incorrect power spectrum is used. Overall deviations of
the power with respect to the true power spectrum always appear
as an excess in the $U_2^2$ statistic. Therefore, such an excess
in $U_2^2$ can reflect either an intrinsic non-Gaussianity or a
deviation of the local power spectrum from the averaged one
(i.e. anisotropy). However, the third test case shows that if
our power spectrum deviates from the real one in a "realistic
way", then we do not expect a significant effect on the mean
value of the $U_2^2$ statistic (see footnote). This is an
important point, because too strong a dependence on the input
power spectrum would make this method difficult to apply in
practice.
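
The link between a wrong overall power level and an excess in the second statistic can be checked with a small simulation. The sketch below assumes the standard smooth-test form for a unit Gaussian, in which $U_i$ is the normalised sum of the $i$-th normalised Hermite polynomial over the data; the function names and the 5% amplitude error are illustrative choices, not values from the text.

```python
import math
import random

def smooth_test_statistics(y):
    """Return [U_1^2, U_2^2, U_3^2, U_4^2] for data y tested against N(0, 1).

    Uses normalised Hermite polynomials h_1..h_4 (an assumed form of the test).
    """
    n = len(y)
    sums = [0.0, 0.0, 0.0, 0.0]
    for v in y:
        sums[0] += v
        sums[1] += (v**2 - 1.0) / math.sqrt(2.0)
        sums[2] += (v**3 - 3.0 * v) / math.sqrt(6.0)
        sums[3] += (v**4 - 6.0 * v**2 + 3.0) / math.sqrt(24.0)
    return [s * s / n for s in sums]

def mean_u2(scale, n_modes=219, n_sims=500, seed=0):
    """Mean U_2^2 over simulations whose true rms is `scale` instead of 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        y = [scale * rng.gauss(0.0, 1.0) for _ in range(n_modes)]
        total += smooth_test_statistics(y)[1]
    return total / n_sims

correct = mean_u2(1.0)    # covariance model matches the data
wrong = mean_u2(1.05)     # data amplitudes 5% larger than the model assumes
```

With these settings `correct` stays close to 1, while `wrong` is clearly larger, because a mismatch in the variance shifts the mean of $U_2$ by an amount that grows as the square root of the number of modes.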



Footnote: However, we note that for this particular case, in
which the "wrong" power spectrum oscillates around the real one,
we find [...] for $E_{\rm cut} = 0.4$, where we have
$\langle U_i^2 \rangle = 0.90$ with an error
$1.69/\sqrt{10000} \approx 0.02$. This shows that although the
average band power is approximately the same as the true value
(and thus we do not detect a significant effect [...]), [...]
is detected in the higher moments ($U_i^2$).
Figure 1. Distributions of the $U_i^2$ statistics, from left to right, top to bottom $i = 1, 2, 3, 4$, when using only $E_i > 0.4$. They are obtained
from 10000 simulated observations of the VSA1 mosaic, so each simulation contains three individual pointings which partially overlap
on the sky. The simulations contain Gaussian CMB signal plus the realistic noise achieved in the observations. Prior to the analysis, the
visibilities are binned in cells of $9\lambda$ in Fourier space. The solid line shows the expected values from a $\chi^2_1$ distribution normalised to the
total number of simulations.



Table 4. Same as Table 2, but now the 10000 simulations are
analysed using the upper fit power spectrum explained in the
text.



Table 5. Same as Table 2, but now the 10000 simulations are
analysed using the oscillating power spectrum explained in the
text.
5.2 Example of application of the Arnoldi method
to VSA data

We now show an example of application of the Arnoldi
method to the VSA data, and we present the result of
the analysis of the VSA2 mosaic observed in the first
campaign of the extended configuration. We have performed this
analysis using both the standard method and the Arnoldi
method.

This mosaic is built up from three individual pointings,
with names VSA2E, VSA2F and VSA2G. The total number
of visibility points for this mosaic, once the data are binned
into $9\lambda$ cells, is 2751, so the dimension of the matrix
$A$ in this case is $p = 5502$. We have considered here the
case of $m = 1000$.

Figure 3. Example of application of the Arnoldi method. We
show the eigenvalues corresponding to the analysis of a mosaiced
field with 5502 data points. We have applied the Lanczos
implementation using $m = 1000$. The solid line shows the eigenvalues
($E_i$) of $A$ derived from the analysis of the full covariance matrix,
while the dot-dashed line shows the eigenvalues ($E_i$) of $H$ from
the Arnoldi method.

Figure 3 shows the comparison of the eigenvalues derived
using both methods. The eigenvalues of $A$ are represented
by the solid line, while the eigenvalues of $H$ are displayed
by the dot-dashed line. We can see that the highest
eigenvalues are recovered very well, but as we approach
the dimension of the $H$ matrix ($m = 1000$), the recovered
eigenvalues depart from the real values. In this particular
case, all eigenvalues with index $i < 773$ are recovered
with a relative error smaller than 0.1%, while for $i < 787$ the
error is smaller than 1%. In general, when the exact values
of the eigenmodes are unknown, the errors are controlled
with the residual norm associated to the Ritz eigenvectors
(Saad 1992).

We note that for $E_{\rm cut} = 0.4$, the first 228 eigenvalues
are inside this cut. As we will see below, we will use this value
for the analysis. Therefore, for our purposes an analysis using
the Arnoldi method will provide exactly the same results for
the $U_i^2$ quantities as the full analysis.
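
The behaviour described above (the largest eigenvalues converge first, long before the Krylov dimension $m$ reaches the matrix size) can be reproduced on a toy problem. The sketch below is a plain Lanczos iteration with full reorthogonalisation written for illustration; it is not the implementation used for the VSA analysis, and the matrix size, $m$ and the test spectrum are hypothetical.

```python
import numpy as np

def lanczos_eigenvalues(A, m, seed=0):
    """Ritz values (descending) of a symmetric matrix A after m Lanczos steps."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    Q = np.zeros((n, m))          # Lanczos vectors
    alpha = np.zeros(m)           # diagonal of the tridiagonal matrix
    beta = np.zeros(m - 1)        # off-diagonal of the tridiagonal matrix
    for j in range(m):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w
        # Full reorthogonalisation against all Lanczos vectors so far.
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j == m - 1:
            break
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-12:       # invariant subspace reached: stop early
            alpha, beta = alpha[:j + 1], beta[:j]
            break
        q = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.sort(np.linalg.eigvalsh(T))[::-1]

# Toy symmetric matrix with a known, decaying spectrum.
rng = np.random.default_rng(0)
n, m = 200, 60
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
true_E = 10.0 * 0.9 ** np.arange(n)
A = (V * true_E) @ V.T
ritz = lanczos_eigenvalues(A, m)
rel_err = np.abs(ritz[:5] - true_E[:5]) / true_E[:5]
```

Only the leading Ritz values are compared here; those near the end of the list are not expected to match any true eigenvalue, which mirrors the departure seen close to index $m$ in Figure 3.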



6 GAUSSIANITY ANALYSIS OF VSA DATA 

In this section, we present the results of the non-Gaussianity
analysis of the VSA data using the STGOF. The method has
been applied to the list of pointings quoted in Section 2,
considering the individual pointings both separately and
arranged into the corresponding mosaics. For each analysis,
the $U_i^2$ statistics were obtained for different values of
$E_{\rm cut}$, ranging from 0 to 0.5.

In all cases (except for the Corona-Borealis mosaic),
the analysis was performed using both the standard procedure
(full diagonalization of the covariance matrix) and the
Lanczos algorithm with $m = 1000$. We have checked that
the values of the statistics ($U_i^2$) derived from both methods
are exactly the same for those cuts with $E_{\rm cut} \ge 0.1$,
and there are small differences for
$0.1 > E_{\rm cut} \ge 0.01$. However, the standard computation
was more time consuming than the Lanczos algorithm. For example,
once the covariance matrix is computed, the typical computing
time for the diagonalization of a matrix with $N_d = 7200$ was
$\sim 1.2$ hours on a 2.6 GHz processor with 2 GB of RAM, and
this number scales roughly as $N_d^3$. By contrast, the Lanczos
algorithm with $m = 1000$ takes only 15 s in the diagonalization
step.
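
To give a feeling for these numbers, the short calculation below extrapolates the quoted timing to the size of the Corona Borealis covariance matrix ($N_d = 13258$, quoted in Section 6.3). It assumes the cubic scaling that is usual for dense diagonalisation; the function name is hypothetical.

```python
def full_diag_hours(n_d, ref_n=7200, ref_hours=1.2, exponent=3):
    """Extrapolated time (hours) for a full diagonalisation of size n_d,
    assuming the time scales as n_d**exponent from the quoted reference."""
    return ref_hours * (n_d / ref_n) ** exponent

corona_hours = full_diag_hours(13258)   # roughly 7.5 hours under this scaling
speedup = 1.2 * 3600 / 15.0             # full diagonalisation vs Lanczos at N_d = 7200
```

Under these assumptions the full diagonalisation of the largest mosaic would take several hours, while the quoted timings at $N_d = 7200$ differ by a factor of a few hundred.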

The power spectrum used to compute the covariance
matrices corresponds to the one derived from the data
(Dickinson et al. 2004), and presented as a solid line in
Fig. 2. As mentioned above, the data files are binned in
the visibility space into cells of $4\lambda$ for the compact
array data, and $9\lambda$ for the extended array data.

Given the huge number of statistics we obtained, we
proceed as follows in order to present a comprehensive
summary of the analysis. We shall present our results only for the
analyses of the mosaiced observations, and we shall quote
the values and the significance of the $U_i^2$ statistics for the
eigenvalue cut $E_{\rm cut} = 0.4$. High values of $E_{\rm cut}$
are desirable because we select eigenmodes with higher
signal-to-noise ratios, and for these values the statistical
properties of the signal are not diluted by the noise. However,
if the value of $E_{\rm cut}$ is too high, then we end up with a
small number of data points. As shown in A05, for the
signal-to-noise ratio achieved in a VSA field, this cut for the
eigenvalues uses a reasonable number of data points
($\sim 10\%$), and at the same time it gives good results in
discriminating non-Gaussian signals obtained from the Edgeworth
expansion and from string simulations. In any case, we have
checked that there are no significant differences if we change
the value of $E_{\rm cut}$ in the range 0.1 to 0.5.

If a non-Gaussian signal is detected in a given mosaic,
we follow these steps:

a) We present the values of the statistics for the individual
fields, in order to identify and localise the pointing(s)
responsible for the non-Gaussian signal.

b) Data corresponding to those individual pointings con- 
taining the non-Gaussian signal are split into two parts, cor- 
responding to different epochs of observation. The analysis is 
performed in each one of these two parts, in order to isolate 
possible residual systematic effects. 

c) Those individual pointings are also analysed by splitting
the data into separate regions of the uv-plane, so one
can localise the origin of the signal in Fourier space. To allow
a simple identification of the Fourier regions, we divided the
uv-plane into 16 concentric annuli. The edges of these annuli
are taken from the bins adopted in Dickinson et al. (2004)
to present the power spectrum results, and are quoted in
Table 6 (note that $\ell = 2\pi|u|$).

d) The VSA collaboration has always maintained two in- 
dependent pipelines, so every pointing has been reduced in 
parallel by at least two of the three institutions. Thus, if a 
non-Gaussian signal is detected in a given mosaic, we also 
checked the second (independent) version of the reduction 
of the data, to identify if the non-Gaussian signal was due 
to residual systematic effects. 

e) Finally, we have also explored the robustness of the
results when using different noise estimates for the visibilities.
Within the VSA collaboration, we have used two different
noise estimates, one based on daily estimates from the scatter
on the visibility data in each baseline, and another one
based on the scatter of the visibilities when they are binned
into cells prior to the power spectrum computations (see
the details in Taylor et al. 2003). These two methods have
been shown to produce consistent results on the power spectra,
but we shall explore here whether those non-Gaussian
signals could be understood using a different noise
characterization.

Table 6. Multipole bins considered in our analysis. The quoted
values correspond to the same bin limits used in the power
spectrum estimation, as presented in Dickinson et al. (2004).
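
The annular split of step c) amounts to converting each visibility position to a multipole through $\ell = 2\pi|u|$ and looking up the bin that contains it. A minimal sketch follows; the bin edges used here are hypothetical placeholders, not the limits of the multipole-bin table.

```python
import bisect
import math

def multipole(u, v):
    """Multipole ell = 2*pi*|u| for a visibility at (u, v), in wavelengths."""
    return 2.0 * math.pi * math.hypot(u, v)

def annulus_index(u, v, edges):
    """Index of the annulus [edges[k], edges[k+1]) containing the visibility,
    or None if it falls outside the binned range."""
    ell = multipole(u, v)
    if ell < edges[0] or ell >= edges[-1]:
        return None
    return bisect.bisect_right(edges, ell) - 1

# Hypothetical bin edges in ell, for illustration only.
edges = [100, 200, 300, 400]
inner = annulus_index(250.0 / (2.0 * math.pi), 0.0, edges)   # ell close to 250
outer = annulus_index(40.0, 30.0, edges)                      # |u| = 50
```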



Data have been split according to the configuration of
the instrument (compact or extended data). This will permit
us to isolate possible systematic effects which could
be present only in a given configuration. For the extended
array dataset, we also split the data into two subgroups,
which correspond to the dataset presented in Grainge et al.
(2003), and the rest of the VSA extended fields described in
Dickinson et al. (2004). This allows a direct comparison of
our results for the extended array with those from the previous
papers (Savage et al. 2004; Smith et al. 2004) on the
same datasets.

Our results are presented in Table 7 for the case of
$E_{\rm cut} = 0.4$. The total number of visibility points in the
binned files is also shown in the second column of that table.
All the results are compatible with a Gaussian distribution,
except in three cases where we obtain values for
the statistics with a significance greater than 95%. These
cases are the mosaics for the VSA2 and VSA3 fields obtained
with the first release of the extended configuration
(fields VSA2E, VSA2F and VSA2G, and VSA3E, VSA3F
and VSA3G, respectively), and the Corona-Borealis mosaic
(quoted as CoronaB). In these three cases, there is apparently
a detection with the $U_2^2$ statistic, which may indicate
either a deviation from the theoretical power spectrum or
a non-Gaussian signature (as shown for instance for cosmic
strings in A05). We now discuss each of these three cases in
detail.



6.1 VSA2 mosaic, extended configuration 

An individual analysis of the three fields of the mosaic
shows that VSA2E is compatible with Gaussianity
($U_2^2 = 2.46$ (88.3%)), while VSA2F ($U_2^2 = 10.3$ (99.9%))
and VSA2G ($U_2^2 = 17.0$ (100%)) present a strong deviation.
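
The percentages quoted next to each statistic can be reproduced from the cumulative probability of a $\chi^2$ distribution with one degree of freedom, which has the closed form $\mathrm{erf}(\sqrt{x/2})$. The helper below is a minimal sketch with a hypothetical function name.

```python
import math

def chi2_1_probability(u2):
    """Cumulative probability of a chi-square variable with one degree of
    freedom being below u2, i.e. erf(sqrt(u2 / 2))."""
    return math.erf(math.sqrt(u2 / 2.0))

p_vsa2e = 100.0 * chi2_1_probability(2.46)   # compare with the quoted 88.3%
p_vsa2f = 100.0 * chi2_1_probability(10.3)   # compare with the quoted 99.9%
```

Small differences in the last digit are expected, since the quoted statistic values are themselves rounded.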

If we now split the VSA2F and VSA2G datasets into
two parts, corresponding to two separate epochs, we find
that the non-Gaussian signal is present in both of them, and
for both pointings. This result suggests that the origin of the
signal is intrinsic to the data, because it cannot be isolated
in a separate epoch. In order to localise the origin of this
signal in Fourier space, we performed the analysis using the
16 annuli mentioned above, and defined in Table 6.
In order to keep a reasonable number of data points in this
"bin analysis", we now use a lower value for $E_{\rm cut}$
($= 0.1$). The detailed analysis shows that the non-Gaussian
signal in $U_2^2$ associated with VSA2F is localised in bins 9
and 11, while the non-Gaussianity associated with VSA2G comes
from bins 9, 10 and 11.

The next step, in order to cross-check these results,
is to use the independent reduction of these two
pointings that we produced in the collaboration. An analysis
of this second version of the data shows that both
VSA2F and VSA2G present a deviation in $U_2^2$, although
in the case of the VSA2F pointing this deviation is smaller
($U_2^2 = 5.1$ (97.6%)). We have also checked that these
non-Gaussianity detections are robust against the two different
schemes for noise estimation. Moreover, varying the noise
estimates for the data within $\pm 5\%$ does not change the
results on the $U_2^2$ statistic.



6.1.1 Study of the local power spectrum 

The previous tests suggest that the non-Gaussian signal
detected in the data via the $U_2^2$ statistic could be real, and
not due to systematics. Given that $U_2^2$ may indicate deviations
in the second moment of the (transformed) visibilities, we
have investigated the power spectrum of these two fields, as
well as that of the VSA2EFG mosaic.

We have used 8 bins instead of 16 for this computation
because the thermal noise in a single VSA field is high, so
using a small number of visibilities could bias the estimation
of the power. The 8 bins were obtained by joining the sixteen
bins from Table 6 in pairs, so one can easily relate the new
bins to the old ones (i.e. the new bin 1 corresponds to bins
1 and 2 from that table, and so on).

The power spectrum of the VSA2EFG field is found to
be compatible (at the 1-sigma level) with the one presented
in Dickinson et al. (2004), except in two bins which correspond
to bins 9 & 10, and 11 & 12 from Table 6, and where
we find a 2.1-sigma and a 1.5-sigma deviation toward higher
values, respectively (see Fig. 4). Although noisy, the power
spectra of the individual fields (VSA2F and VSA2G) also
show a deviation on those scales, which is larger in the VSA2G
case (practically 2-sigma). The VSA2E field shows no deviations.
A visual inspection of the VSA2 extended I mosaic
(see figure 3 in Dickinson et al. 2004) shows an intense
positive feature close to the centre of the G pointing, which
could be responsible for that deviation.

In order to check if these fields are intrinsically Gaussian,
we have analysed the VSA2EFG mosaic using now
its own power spectrum. In this case, we find the value
$U_2^2 = 1.01$ (68.38%) for $E_{\rm cut} = 0.4$, which is now
compatible with Gaussianity. These results suggest that the
$U_2^2$ excess found in this case is connected with a deviation of
the power spectrum of the region with respect to the average
one, and not with an intrinsic non-Gaussianity (in the
sense that when we perform the analysis using their power
spectrum, then they are Gaussian).

Table 7. Values for the $U_i^2$ statistics and their corresponding probabilities (within parentheses) derived from the $\chi^2_1$ distribution from
the analysis of the VSA mosaiced fields, using the cut $E_{\rm cut} = 0.4$. The second column shows the number of visibility points ($N$) after binning
the VSA data, prior to the Gaussianity analysis. The size of the covariance matrix is $N_d \times N_d$, with $N_d = 2N$. The three cases where we
have a non-Gaussian detection are marked using bold characters.

Figure 4. Power spectrum of the VSA2 mosaic observed with
the extended I configuration (pointings VSA2E, VSA2F and
VSA2G). Although noisy (there are only three pointings
entering in the computation), we can see a deviation with respect to
the Dickinson et al. result, at angular scales corresponding to
$\ell \sim 700$-$900$.

To complete our study with the STGOF, we have
investigated how significant that deviation of the power
spectrum of the region is with respect to the average power
spectrum. The probability of finding such a deviation in a
multivariate Gaussian field can be easily derived by noting
that the 8 bins used in the power spectrum computation are
practically independent. Using Monte Carlo simulations, we
estimate that with the measured band powers and errors,
the probability of having 2 numbers out of 8 such that one
of them deviates 2.1-sigma and the other one 1.5-sigma is
4.6%. If we now impose that these two values should be in
adjacent bins (as is our case), then this value reduces to
1.4%. Given that we have observed 13 mosaics with similar
characteristics (although not completely independent,
because mosaics 1, 2 and 3 for different configurations
partially overlap), we conclude that this deviation of the
power spectrum is not as significant as the one obtained
using the $U_2^2$ statistic. In any case, both analyses suggest that
we have detected a local deviation of the power spectrum
(i.e. an anisotropy) in this VSA2 mosaic.
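
The Monte Carlo estimate quoted for the 8 band powers (4.6%, or 1.4% for adjacent bins) can be approximated with a simple sketch. It assumes 8 independent unit-Gaussian deviations and reads the event as one-sided: at least one bin at or above +2.1 sigma together with another bin at or above +1.5 sigma. This reading and the function name are assumptions; the analysis in the text used the measured band powers and errors.

```python
import random

def exceedance_probabilities(n_trials=200000, n_bins=8, hi=2.1, lo=1.5, seed=1):
    """Estimate (p_any, p_adjacent) for independent unit-Gaussian bins.

    p_any: one bin >= hi and a different bin >= lo (upward deviations only).
    p_adjacent: the same, with the two bins required to be neighbours.
    """
    rng = random.Random(seed)
    any_pair = 0
    adjacent_pair = 0
    for _ in range(n_trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_bins)]
        above_lo = [k for k, v in enumerate(z) if v >= lo]
        if len(above_lo) < 2 or max(z) < hi:
            continue
        any_pair += 1
        above_hi = [k for k in above_lo if z[k] >= hi]
        if any(abs(i - j) == 1 for i in above_hi for j in above_lo):
            adjacent_pair += 1
    return any_pair / n_trials, adjacent_pair / n_trials

p_any, p_adjacent = exceedance_probabilities()
```

Under these assumptions the two estimates come out close to the quoted 4.6% and 1.4%, which suggests the quoted numbers refer to upward deviations.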



6.2 VSA3 mosaic, extended configuration 

An individual analysis of the three fields contained in the
mosaic shows that VSA3F is compatible with Gaussianity
($U_2^2 = 0.52$ (52.8%)), while VSA3E ($U_2^2 = 5.48$ (98.1%))
and VSA3G ($U_2^2 = 5.89$ (98.5%)) present a deviation.

We proceed as in the previous case, and we first split
the two datasets (VSA3E and VSA3G) into two separate
parts corresponding to different observation epochs. We find
that the non-Gaussian signal connected with VSA3G is only
present in one part of the data, while the one corresponding
to VSA3E is absent in both parts. This suggests that at least
the non-Gaussian signal found in VSA3G with $U_2^2$ could be
due to systematic effects, because it is only present in one part
of the data.

Next, we have examined the results using the two different
noise estimation schemes. Again, varying the noise
estimates within $\pm 5\%$ does not change the results on $U_2^2$.
However, in this case we find a difference between the two
methods for the VSA3E and VSA3G pointings. When using
a noise estimation based on the scatter of the visibilities
when binning all daily observations, we find that now
VSA3G is compatible with Gaussianity ($U_2^2 = 0.33$ (43.4%)),
and VSA3E shows a marginal deviation ($U_2^2 = 3.85$ (95.0%)).
These results suggest that the non-Gaussian signal found in
this mosaic is in reality produced by residual systematic
effects associated with a few visibility points which were left
in the analysis. This is confirmed when using the other
independent data reduction of these two pointings, where we find
that both VSA3E and VSA3G are compatible with Gaussianity.

The detailed analysis in separate bins permits us to isolate
the origin of this signal, and exclude the affected visibilities.
In both cases, it is connected with one single bin (5 and
11, respectively). Once these few visibilities are removed, all
the individual pointings become compatible with Gaussianity
in the two versions of the data reduction, and the joint
analysis of the three-field mosaic gives $U_2^2 = 1.85$ (82.7%).



6.3 Corona Borealis mosaic, extended 
configuration 

The previous analysis of the VSA data shows that the
STGOF tests are very sensitive methods to detect residual
systematic effects in the data and/or deviations of the power
spectrum from the average one. To further check the power
of this method, we have also applied it to the analysis of the
data from a survey of the Corona Borealis supercluster region
with the VSA (Genova-Santos et al. 2005). These data
are known to present a strong deviation from Gaussianity,
associated with a negative decrement in the map which cannot
be explained in terms of primordial CMB fluctuations,
or associated with a (known) cluster of galaxies in the region.
A power spectrum analysis of these data presents a clear
deviation with respect to the cosmological one at angular
scales around $\ell \sim 500$, which corresponds to the angular
size of that negative decrement.

Here, we will complete the Gaussianity studies described
in Genova-Santos et al. (2005) by performing the
simultaneous analysis of the data with the STGOF. We
have analysed the 9 pointings which altogether make a mosaic
of the central region of the supercluster. Those pointings
are denoted with the letters A, B, C, D, E, H, I, J and K. Given
the large size of the full covariance matrix for this dataset
($N_d = 13258$), only the Lanczos method was applied. As
shown above, for the signal-to-noise ratios achieved in the
VSA observations, the Lanczos method with $m = 1000$,
when applied to covariance matrices with $N_d \sim 7000$, gives
the same values of the $U_i^2$ as those obtained with the full
analysis if we restrict ourselves to $E_{\rm cut} \ge 0.01$ (i.e.
77% of the 1000 eigenvalues are good approximations to the true
values). In the Corona Borealis case, only $\sim 330$ eigenvalues
are found to be above the cut $E_{\rm cut} = 0.4$, so the use of
the Lanczos algorithm is justified.

In the last row of Table 7 we present the results obtained
with the Arnoldi algorithm considering $m = 1000$
and $E_{\rm cut} = 0.4$. These numbers show a deviation of the
$U_2^2$ statistic, as we would expect from the fact that the power
spectrum of this mosaic differs from the average one.

We now proceed as in the previous cases, and we first
analyse all the individual pointings which participate in the
mosaic. Considered as a whole, each one of the 9 pointings
seems to be consistent with Gaussianity, although the highest
$U_2^2$ value is found for the H pointing ($U_2^2 = 3.72$ (94.6%)),
which contains the main decrement near its centre. However,
a detailed analysis in separate bins of all the nine datasets
shows that the H pointing is the only one presenting a strong
deviation for $U_2^2$, which is associated with the multipole
region around $\ell \sim 500$, as we would expect. These results
are stable when splitting the data in two separate parts,
so the non-Gaussianity seems to be intrinsic to the data.
The results are also the same for the two noise estimation
schemes.

Finally, we have re-analysed the whole Corona-Borealis
mosaic using its own power spectrum, in order to probe whether
the detected non-Gaussianity is only associated with the fact
that we have a deviation of the power spectrum. In this
case, we obtain $U_2^2 = 7.44$ (99.4%). This result is very
interesting, because it shows that in this case the
non-Gaussianity found in the Corona-Borealis mosaic is intrinsic,
and is not associated with a deviation of the power spectrum:
even when we use the correct (local) power spectrum of the
region in the analysis, $U_2^2$ still shows a detection of
non-Gaussianity.



7 DISCUSSION AND CONCLUSIONS 

We have analysed the full VSA data sets presented in
Taylor et al. (2003), Grainge et al. (2003), Dickinson et al.
(2004) and Genova-Santos et al. (2005), using the Smooth
Tests of Goodness-of-Fit adapted to interferometer experiments.
This method was described in A05, but here it has
been extended to deal with large mosaics via the Arnoldi
method. This numerical method permits us to solve large
eigenvalue problems by reducing the dimensionality of the
covariance matrix. We have shown that one implementation of
this method for Hermitian matrices, the Lanczos algorithm,
is able to provide good approximations to those eigenvalues
and eigenmodes of the full covariance matrix with larger
signal-to-noise ratio.

From our analysis of the VSA data dedicated to cosmological
studies, we found that out of the 13 mosaics presented
in Table 7, eleven are consistent with Gaussianity,
and two show a deviation from Gaussianity. In one
case (mosaic VSA3E + VSA3F + VSA3G) the non-Gaussian
signal is shown to be produced by a few visibility points which
contain systematic effects that were not removed properly
from the data. Once these data are removed, the mosaic
becomes compatible with Gaussianity.

In the second case (mosaic VSA2E + VSA2F + VSA2G),
we show that the method is detecting a local deviation
of the power spectrum with respect to the average
one. The STGOF are very sensitive to the power spectrum
adopted for the computation of the signal-to-noise eigenmodes,
so small deviations from the correct power spectrum
are easily detected in the analysis. This ability of the method
could be used to study the isotropy of a given CMB map.
Moreover, when the data of this mosaic are analysed with
their (local) power spectrum, then they are compatible with
Gaussianity; but when we analyse the whole data set with
the local power spectrum of the VSA2EFG mosaic, then the
rest of the mosaics become incompatible with Gaussianity.
These results could indicate the presence of anisotropy.

However, there could be other possible explanations,
such as the presence of residual foregrounds in this particular
mosaic. Nevertheless, there are no significant features
seen in multi-frequency foreground maps (408 MHz, H$\alpha$,
100 $\mu$m dust map) that align with the main features of the
VSA mosaic. Moreover, this mosaic is one of the cleanest
regions (in terms of rms), as shown in Taylor et al. (2003).
Regarding the case of point sources, we have investigated
the possibility that this deviation could be produced by two
unsubtracted sources with fluxes around 40 mJy. Note that
a population of unresolved sources is not a possible explanation,
because its contribution should scale as $\ell^2$ and it
would produce deviations on smaller scales as well. However,
two sources would produce structure as a function of
$\ell$. The value of 40 mJy is an extrapolation to 33 GHz of the
completeness limit of the Ryle Telescope survey at 15 GHz
(see Waldram et al. 2003, and also Cleary et al. 2004 for
the details of the survey). The 5-sigma limit from the Ryle
Telescope at 15 GHz is 10 mJy; the worst case that we can have
is a rising spectrum source with index of 2, so we considered
the case of two 40 mJy sources. Using Monte Carlo simulations
to explore all the possible relative spatial distributions
of the two sources, we find that we cannot explain the
measured value of $U_2^2$ in VSA2G.

It is also interesting to mention that the VSA2F and
VSA2G fields are practically contained within the VSA2
mosaic obtained with the compact array. Although those
data were obtained with a different configuration, the scales
where we find a deviation with the extended array were
also sampled with the compact array (but with poorer
signal-to-noise ratio), so we could expect a small signature
in the analysis. However, the value for VSA2 compact is
$U_2^2 = 2.43$ (88.1%), which is somewhat high but still
compatible with Gaussianity.

We also note that the VSA2 and VSA3 fields were already
analysed with other Gaussianity tools in Savage et al.
(2004) and Smith et al. (2004), but no evidence of
non-Gaussian signals was found. This shows the importance of
applying a wide range of Gaussianity tests to the data,
given that each particular test is sensitive to a different type
of non-Gaussianity.

The fraction of data affected by the systematic effects
in the VSA3 extended I mosaic is too small to affect the power
spectrum published by the VSA collaboration. A re-evaluation
of the complete VSA power spectrum when we
use the corrected version of these data shows no differences
with the published values (within the numerical precision
of the maximum likelihood code). However, the deviation
in the VSA2 extended I mosaic could influence these numbers.
Although there is no justified reason to exclude this
VSA2 mosaic from the final computation, we have quantified
the effect of excluding those data from the final analysis.
Thus, we have considered the extreme case in which
we remove from the dataset all visibilities which lie inside
one of the bins showing the non-Gaussianity. With this new
dataset, we have re-evaluated the complete power spectrum,
and compared it to the one presented in Dickinson et al.
(2004) in order to obtain the maximum deviations that we
would expect. Differences (within the numerical precision
of the power spectrum code) are only found in three bins
(10, 11 & 12), and are of the order of -9.2%, -4.0% and
+4.7% with respect to the published values. To complete this
check, we have repeated the parameter estimation analysis
described in Rebolo et al. (2004), but now using this new
power spectrum. We have considered two different cases,
corresponding to the use of VSA+WMAP data on the one hand, and
VSA+COBE data on the other, and we have explored
the 6-parameter flat $\Lambda$CDM model described in Table 2 of
Rebolo et al. (2004). The differences in all parameters for
the VSA+WMAP case are found to be at most 2%, so
we can conclude that the cosmological analyses based on the
published data are not affected. However, when using only
the VSA+COBE data, we find a significant difference only
in the estimate of the baryon density, which turns out to be
$\Omega_{\rm b}h^2 = 0.030^{+\ldots}_{-0.005}$ at 68% C.L., and
which should be compared with the former value
$\Omega_{\rm b}h^2 = 0.033^{+\ldots}_{-0.007}$. This change
can be easily understood, because the new power spectrum
has less power in the region of the third acoustic peak, giving
a smaller (but compatible) value of the baryon density.
Note that the new value is now closer to the BBN result, as
well as to the result of the analysis using WMAP+VSA.

Finally, we have applied the method to VSA observations
of the Corona Borealis supercluster. These observations
are known to present a deviation in the power spectrum
produced by a strong decrement in one of the pointings
(see Genova-Santos et al. 2005). Our method is able
to detect this deviation, and it finds a large value for the
$U_2^2$ statistic which cannot be interpreted as systematic effects.
A careful analysis shows that the non-Gaussian signal is
associated with the same scales where we find a deviation in
the local power spectrum, and which correspond to the angular
scale of the negative decrement. However, in this case
the non-Gaussian signal detected by the method is intrinsic
to the data, in the sense that if we use the local power spectrum
and repeat the analysis, a non-Gaussian detection
is still present.